Allow more than one http rest & grpc listeners #3749
docs/performance_tuning.md
Outdated
## Network Configuration for Optimal Performance
When clients connect to the server using hostname resolution (particularly "localhost"), the system may attempt IPv6 resolution first before falling back to IPv4. If IPv6 is disabled, misconfigured, or unavailable, this can cause connection timeouts and delays before the IPv4 fallback occurs, which is especially noticeable when minimizing time to first token in generative AI applications.
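One way to check whether a given client machine is affected is to look at the order in which the resolver returns address families for the hostname. The following is a minimal sketch using only the Python standard library; the helper names `resolution_order` and `ipv6_preferred` are illustrative and are not part of OVMS.

```python
import socket


def resolution_order(host: str, port: int = 0) -> list[str]:
    """Return the address families for `host` in the order the resolver reports them."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr).
    return [names.get(family, str(family)) for family, *_ in infos]


def ipv6_preferred(order: list[str]) -> bool:
    """True if the first resolved address is IPv6, i.e. clients will likely try IPv6 first."""
    return bool(order) and order[0] == "IPv6"
```

Calling `ipv6_preferred(resolution_order("localhost"))` on the client host gives a hint: if it returns `True` while the server listens only on IPv4, the fallback delay described above is likely to apply.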
By default, OVMS endpoints are bound to all IPv4 addresses. On some systems that resolve the name localhost to an IPv6 address, the client may need extra time to fall back to IPv4. This can effectively add 1-2 s of latency.
It can be avoided by switching the API URL to http://127.0.0.1 instead.
Alternatively, IPv6 can be enabled in the model server using --grpc_bind_address and --rest_bind_address.
For example:
--grpc_bind_address 127.0.0.1,::1 --rest_bind_address 127.0.0.1,::1
or
--grpc_bind_address 0.0.0.0,:: --rest_bind_address 0.0.0.0,::
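To confirm that either workaround removes the delay, the connection time for each hostname can be measured from the client side. This is only a rough sketch with the Python standard library; `time_connect` is a hypothetical helper, and the port passed to it must be whatever REST or gRPC port the server was actually started with.

```python
import socket
import time


def time_connect(host: str, port: int, timeout: float = 5.0) -> float:
    """Return the seconds taken to open a TCP connection to host:port.

    socket.create_connection tries each resolved address in turn, so a slow
    IPv6 attempt followed by an IPv4 fallback shows up as a longer duration.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # Connection established; close it immediately.
    return time.perf_counter() - start
```

Comparing `time_connect("localhost", port)` with `time_connect("127.0.0.1", port)` against a running server should show whether the hostname path pays an extra fallback cost.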
changed
🛠 Summary
CVS-170537